Duplication-Loss Genome Alignment: Complexity and Algorithm

Authors

  • Billel Benzaid
  • Riccardo Dondi
  • Nadia El-Mabrouk
Abstract

Recently, an alignment approach for the comparison of two genomes, based on an evolutionary model restricted to duplications and losses, has been presented. An exact linear programming algorithm has been developed and successfully applied to the transfer RNA (tRNA) repertoire in Bacteria, leading to interesting observations on tRNA shift of identity. Here, we explore a direct dynamic programming approach for the Duplication-Loss Alignment of two genomes, which proceeds in two steps: (1) a dynamic programming step, which outputs a best candidate alignment between the two genomes, and (2) the Minimum Label Alignment problem, which finds an evolutionary scenario of minimum duplication-loss cost that is in agreement with the alignment. We show that the Minimum Label Alignment problem is APX-hard, even if the number of occurrences of a gene inside a genome is bounded by 5. We then develop a heuristic which is thousands of times faster than the linear programming algorithm and exhibits a high degree of accuracy on simulated datasets. The heuristic has been implemented in Java and is available on request.
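To make step (1) concrete, the following is a minimal sketch of a generic dynamic programming alignment of two gene orders. It is not the authors' algorithm: it assumes genomes are given as strings of gene-family symbols, that only identical genes may be aligned, and that every unaligned gene costs 1. The function name `align_genomes` and this unit cost model are illustrative assumptions.

```python
def align_genomes(x, y):
    """Align two gene orders x and y (sequences of gene-family symbols).

    Assumption (illustrative only): identical genes may be aligned at cost 0,
    and every unaligned gene costs 1. Returns (cost, columns), where each
    column is (gene_in_x, gene_in_y) and None marks a gap.
    """
    n, m = len(x), len(y)
    # cost[i][j] = minimum number of unaligned genes for x[:i] versus y[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(cost[i - 1][j], cost[i][j - 1]) + 1
            if x[i - 1] == y[j - 1]:
                best = min(best, cost[i - 1][j - 1])
            cost[i][j] = best

    # Trace back from the bottom-right cell to recover one optimal alignment.
    columns = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and x[i - 1] == y[j - 1]
                and cost[i][j] == cost[i - 1][j - 1]):
            columns.append((x[i - 1], y[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            columns.append((x[i - 1], None))  # gene of x left unaligned
            i -= 1
        else:
            columns.append((None, y[j - 1]))  # gene of y left unaligned
            j -= 1
    columns.reverse()
    return cost[n][m], columns


if __name__ == "__main__":
    total, cols = align_genomes("abcab", "abc")
    print(total, cols)
```

In the two-step scheme described in the abstract, the columns containing a gap are the ones that step (2) would then have to label as duplications or losses; that labeling is the part shown to be APX-hard and is not attempted in this sketch.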

Similar Resources

Aligning and Labeling Genomes under the Duplication-Loss Model

In this paper we investigate the complexity of two combinatorial problems related to genome alignment, a recent approach to genome comparison based on a duplication-loss model of evolution. The first combinatorial problem, Duplication-Loss Alignment, aims to align two genomes and to explain the unaligned part of the genomes as duplications and losses. The problem has been recently shown to be N...


Ancestral Genome Organization: An Alignment Approach

We present a comparative genomics approach for inferring ancestral genome organization and evolutionary scenarios, based on present-day genomes represented as ordered gene sequences with duplicates. We develop our methodology for a model of evolution restricted to duplication and loss, and then show how to extend it to other content-modifying operations, and to inversions. From a combinatorial ...


An Improved Algorithm for Genome Rearrangements

A remarkable pattern of evolution is that many species have closely related gene sequences but differ dramatically in gene order. This raises a new challenge in aligning two genome sequences, in that we have to consider changes at both the nucleotide level and the locus level, such as gene rearrangements, duplication or loss. Finding the series of rearrangements at the same time with changes at nuc...


Progressive Mauve: Multiple alignment of genomes with gene flux and rearrangement

Multiple genome alignment remains a challenging problem. Effects of recombination including rearrangement, segmental duplication, gain, and loss can create a mosaic pattern of orthology even among closely related organisms. We describe a method to align two or more genomes that have undergone large-scale recombination, particularly genomes that have undergone substantial amounts of gene gain an...


Evolution of Genome Organization by Duplication and Loss: An Alignment Approach

We present a comparative genomics approach for inferring ancestral genome organization and evolutionary scenarios, based on a model accounting for content-modifying operations. More precisely, we focus on comparing two ordered gene sequences with duplicated genes that have evolved from a common ancestor through duplications and losses; our model can be grouped in the class of “Block Edit” model...


Publication date: 2013